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Abstract 

A number of new functionalities have been added to the Alloy Theoretic Automated Toolkit (ATAT) 
since it was last reviewed in this journal in 2002. ATAT can now handle multicomponent multisublattice 
alloy systems, nonconfigurational sources of entropy (e.g. vibrational and electronic entropy), Special 
Quasirandom Structures (SQS) generation, tensorial cluster expansion construction and includes inter- 
faces for multiple atomistic or ab initio codes. This paper presents an overview of these features geared 
towards the practical use of the code. The extensions to the cluster expansion formalism needed to cover 
multicomponent multisublattice alloys are also formally demonstrated. 

1 Introduction 

The Alloy Theoretic Automated Toolkit (ATAT) [1-3] is a suite of software tools facilitating the determi- 
nation of thermodynamic properties of solid state alloys from first-principles calculations. It relies upon the 
cluster expansion formalism [4-11] to build a simplified effective Hamiltonian that accurately reproduces 
quantum mechanical calculation results for an alloy system of interest and that can be used to efhciently 
calculate their thermodynamic properties. ATAT is freely distributed and open-source, thus encouraging 
users to contribute [12]. 

ATAT's basic functionalities have been described in an earlier issue of the CALPHAD journal [1]. Since 
then, a number of new features have been added and the purpose of the present paper is to describe these 
additions and provide a concise "user guide" for these new features. The most significant additions to ATAT 
are 

1. extensions to multicomponent multisublattice systems; 

2. the inclusion of nonconfigurational sources of entropy (vibrational and electronic entropy); 

3. Special Quasirandom Structures (SQS) generation; 

4. tensorial cluster expansion construction; 

5. support for multiple atomistic (ab initio) codes; 

6. various new conversion and analysis utilities. 

The features presented here are not exhaustive and new features are continuously being added. Hence, 
the reader is invited to consult the manual supplied with the ATAT distribution or posted on the ATAT web 
site ( http://alum.mit.edu/www/avdw/atat ) for the most up to date information. 



2 Multicomponent multisublattice systems 

2.1 The cluster expansion in multicomponent multisublattice systems 

The traditional cluster expansion represents the relationship q (cr) between a configuration a and some 
scalar intensive quantity q. While multicomponent alloys have been covered since the original introduction 
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Table 1: Example of a lattice, based on the Martensitc system, written in ATAT's "lattice input file" format. 

1 1 1 . 05 90 90 90 (Tetragonal coordinate system in a b c a P j notation) 

0.5 0.5 0.5 (Primitive unit cell: one vector per line 

0.5 -0.5 0.5 expressed in multiples of the above coordinate 

0.5 0.5 -0.5 system vectors) 

Fe,Ni,Cr (Ternary sublattice) 

0.5 0.5 C.Vac (Binary sublattice, where Vac stands for "vacancy" , 

0.5 . 5 C , Vac with a second symmetrically equivalent site 

0.5 0.5 C , Vac and a third inequivalent site due to the tetragonality) 



of the cluster expansion [4], handling multisublattice systems (in which different sites in the unit cell can 
host different binary sets of atoms) is a more recent extension [13]. ATAT implements a superset of these 
two formalisms, allowing different sites to host an arbitrary (and potentially different) number of elements, 
some of which may be common across sublattices. These sublattices may or may not consist of sets of 
symmetrically equivalent sites. To fix the ideas, refer to Table 1 for an example exploiting all of these 
extensions. The multicomponent multisublattice features of ATAT have been used, for instance, in [14-16]. 

The remainder of this section introduces the requisite extensions to the cluster expansion formalism. The 
reader is referred to earlier reviews on the topic [4-11] or to the first part of this manual [1] for a gentler 
introduction to the topic. 

A configuration a is represented by a vector of occupation variables cr^ indicating which type of atom sits 
on lattice site i. In an alloy where site i can host Mj components, could take any value from to Mj — 1. 
This type of representation of an alloy is also known as a lattice gas model. The CE takes the general form 

9 (^) = X] ^»Ja {Ta' (o-))q; (1) 

where we have used the following definitions: 

• a is a cluster. In the simple case of a binary alloy, a cluster can be described by a vector of elements 
tti equal to one or zero depending whether site i belongs to the cluster or not. If site i can host 
components, can take any values from to Mj — 1, with indicating that site i does not belong 
to cluster a, while the other positive values reflect various possible functional form dependence of the 
energy of cluster a on the occupation cjj of site i. 

• The sum is over all possible clusters a that are mutually symmetrically distinct under the space group 
of the underlying lattice.^ Note that the determination of the space group of the lattice must account 
for the fact that sites hosting different sets of atoms are considered symmetrically distinct. 

• The average (• • • )(j is over all clusters a' that are equivalent by symmetry to cluster a. 

• Fq/ (cr) are cluster functions. They are selected to be of the form 

(<7) = JJ.Tai.Mi (o-j) (2) 

where 7aj,Mi (cj) satisfies 70, Mi (cj) = 1 and the following orthogonality condition 

^ E (^') = { J otherwife " 

(In binary alloys, a common choice is 70,2 (0) = l,7o,2 (1) = l,7i,2 (0) = — 1,71,2 (1) = +!•) Although 
the product (2) is, in principle, over all lattice sites, the choice 70, (cj) = 1 ensures that it reduces 

to a product over sites within cluster a only. 

• ma are multiplicities indicating the number of clusters (e.g. per unit c;cll) equivalent to a by symmetry. 

• Joe are expansion coefficients to be determined. They are also called Effective Cluster Interactions 
(ECI). 

^Hence, out of each set of equivalent clusters, one representative cluster is kept. 
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It can be shown that when all clusters a are considered in the sum, the cluster expansion is able to 
represent any function q (a) of configuration a by an appropriate selection of the values of Ja ■ The crux 
of the argument leading to that conclusion lies in establishing that the (cr) form an orthogonal basis for 
the space of functions of configurations according to a suitably defined inner product. In the general case 
considered here, the original proof [4] needs to be extended as follows. Given two cluster functions and 

associated with two corresponding clusters a and /3, define the inner product: 

(r„,r^) = 5^r„ {a)Tp {a) 

a 

where the sum is over all possible configurations u (i.e., each Oi ranges from to Mj — 1 in lexicographic 
order). We can then note that 

^ Mi-1 ^ M2-I 

= Ml s M2 ^ •••ni(T«-^.(^^)m.^.(^^)) 

(7l=0 (72=0 



= n. 



^ Mi-1 

7T, X] {lai,Mi{(^i)lPi,Mi{<^i)) 



.-0 

(7i=\J 



where the term in bracket is zero for any i such that ai ^ (3i by Equation (3). If no such i exist, then 
each term in bracket is equal to 1. It follows that (rQ,r^) = if the clusters a and /3 differ (oven by a 
single site) and is equal to 1 if they are identical. This establishes that the Ta are orthogonal. Moreover, 
there are as many different a as there are different a (as a, and cTj both can take on Mi distinct values), 
so the "number" of equals^ the "dimension" of the configurational space a. It follows that the F^ form 
a complete orthogonal basis. This result is not obvious when lattice sites may host a different number of 
species (i.e. when Mj is not constant). In that case, the cluster functions (Equation (2)) may involve mixed 
products between entirely different types oijai,Mi (o'i). 

While, the above discussion is valid for any 7ai,Mi (cj) satisfying (3), a specific "^cuMi i^i) i^ust be selected 
in practice. By default, the ATAT uses the following choice: 

1 if ai = 

,Mi {(^i) = { ~ cos(27r [^] jj) if ttj > and odd 



— sin(27r \^ \ j^) if > and even 

where [. . .] denotes the "roimd up" operation and where both ai and cTi can range from to Mi — 1. Note 
that these functions reduce to the single function — (—1)°^' in the binary case {Mi = 2). User-specified 
lai,Mi i'^i) can be provided by (i) editing and following the instructions given in the corrskel . C++ file and 
by (ii) specifying the -erf = [keyword] option on the command line of ATAT commands, where [keyword] 
is a unique identifier. 

Another nonobvious aspect of the above cluster expansion generalization is that the relationship between 
the point correlations and the composition variables is no longer as trivial as in the well-known single 
sublattice binary case. In general, there are three different composition-type quantities to consider, each 
having a specific use. 

1. The point correlation vector p contains all the expected values of the cluster functions (r„ (cr)) asso- 
ciated with a cluster a such that ai is nonzero for a single site i. They enter the expression of the 
cluster expansion which is used to predict energies either during the ground state search procedure or 
in the Monte Carlo simulations. 

2. The "nonredundant" concentration vector c are those that remain after all linearly dependent con- 
centrations have been eliminated (of course, there are multiple valid choices of nonredundant concen- 
trations — one convention must be arbitrarily selected). These linear dependencies arise from the 



^This statement can be made more rigorous by considering a sequence of finite system (with periodic boundary conditions) 
whose size increases to infinity. 
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constraint that the compositions on each sublattice sum up to one. The nonredundant concentrations 
are used for the convex hull construction in the ground state search procedure. 

3. The full vector x of concentrations has no algorithmic use but provides the most user- friendly output. 
These quantities are linearly related: 

c = Cp + co (4) 
X = Xp + xo (5) 

where C and X arc constant matrices while cq and xq arc constant vectors. In general, the matrices C and 
X are rectangular and not invertible. (These matrices can be output within ATAT with the corrdump -pciii 
command, if the user provides a lattice input file lat . in.) Appendix A describes how these matrices are 
calculated. 

The matrix X plays a special role in the conversion of chemical potentials fi into shifts of the effective 
cluster interactions Jq,, as would be needed in a grand canonical Monte Carlo simulation. Note that the 
chemical potential contribution to any of the grand canonical thermodynamic functions can be written as 

fji^x = n^Xp + pFxo = {X'^jj)^ p + pFxo = ^ Pj + fJ-^xo 

j 

from which it can be inferred that h^xq is the shift induced in the empty cluster ECI, while the j'-element 
of the vector X-'^p, is the shift induced in the ECI associated with point correlation pj. 

2.2 Multicomponent cluster expansion construction 

The mmaps command is the multicomponent version of the maps command (discussed in [1,2]) and works 
in a similar fashion. It gradually constructs an increasingly more accurate cluster expansion by repeatedly 
invoking a first-principles code through auxiliary scripts (see Sections 3, 6 and the documentation of the 
pollmach and runstruct_xxxx commands for details). The main difference between the mmaps and maps 
codes is that, given the larger number of components allowed in mmaps, the input and output files have to 
contain more information. The main input file (typically called lat. in), which describes the lattice, is a 
natural extension of the maps input file, as shown in Table 1. 

A second, optional, input file (typically called orange . in) provides the ability to specify a region to 
explore in composition space. This is useful in systems where the provided lattice would be mechanically 
or thermodynamically unstable beyond certain composition limits. For instance, given the lattice shown in 
Table 1, one could specify: 

l*Ni+l*Cr<=0.2 

1*C<=0.1 

where each atom name stands for its overall concentration (it is not the concentration on a sublattice). 
Only linear constraints are supported (and are implicitly combined with a logical "and" , so that the allowed 
regions are necessarily convex). 

This file not only controls the range of concentration of the structures generated by mmaps, but also the 
composition range where the correct ground states are enforced in the fitting process. Note that, occasionally, 
structures outside the specified range may be generated in order to ensure that the ground states found within 
the specified composition range are not masked by ground states outside of it (mmaps internally generates 
structures at all compositions in order to check for this). 

Suitable composition limits are not always known in advance. When these limits arise due to mechanical 
instabilities, one would notice that many of the structures generated by mmaps relax to very different types of 
lattices. A symptom of this is the inability to converge the cluster expansion. A way to check for this problem 
is to use the checkrelax utility, which calculates the mean square of the relaxation strain tensor elements 
for each of the structures generated by mmaps. Note that, by design, this command ignores the isotropic 
component of the strain. It compares relaxed geometries in the */str_relax . out files to the corresponding 
unrelaxed */str.out files. Any value above 0.1 should be regarded as suspect. While having a few isolated 
overrelaxed structures is not typically a problem, if virtually all structures beyond a certain composition 
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limit are relaxing excessively, then it is advisable to specify a suitable creinge . in file and restart mmaps. It 
is also recommended to "hide" any overrelaxed structures by issuing the command 

touch [structure] /error 
for each [structure] exhibiting overrelaxation (typically [structure] is a number referring to one of the 
subdirectories generated by mmaps and containing a structure's geometry). This command creates files called 
error in the specified directory that mmaps will perceive as a signal to ignore the structure in question. 

The relevant composition region is sometimes limited by thermodynamic instabilities. This would be 
detected, for instance, by doing a separate cluster expansion on a different lattice or by computing the 
formation energies of known compounds exhibiting a different lattice. In such cases, however, it is not 
essential to limit the compositions range, since the cluster expansion formalism remains valid for metastable 
alloys. However, unless the metastable region is of direct interest, a more rapidly converging cluster expansion 
may be obtained by specifying a suitable orange . in file and by excluding non-ground state structures in 
the metastable region with the touch command, as explained above. 

It is sometimes tempting to perform a cluster expansion under equality — rather than inequality — 
composition constraints. This is relevant for ionic systems, where it may be most natural to only keep 
structures that are "charge balanced" under the assumption that all species keep their nominal charges. 
This is not implemented in mmaps, because it is very risky approach. A cluster expansion providing an 
excellent fit to charge-balanced structures may inadvertently predict non-charge-balanced ground states. A 
way to prevent such problems [13] is to restrict compositions to lie in a nonzero- width slice around the 
charge-balanced hyperplane, as in the following example for the Samarium-doped Ceria system [14] : 

-2*0+4*Ce+3*Sm<=0 . 2 

-2*0+4*Ce+3*Sm>=-0 . 2 

mmaps will, if necessary, sample structures outside of the specified range to ensure that no spurious non- 
charge-balanced ground states exist. 

Compared to maps, the mmaps code has a number of new output files. The atoms . out file lists all atomic 
species given in the input files in the order that will be used to report composition variables in all output 
files. The ref .energy . out file reports the atomic reference energies used to calculate formation energies in 
all output files (in the same order as in atoms. out). These reference energies are the energies of the pure 
elements, if they are included in the fit. Otherwise, if composition constraints are imposed,^ the structures 
with the most extreme compositions^ are used as reference. These non-standard references are converted 
into atomic energies before being output in ref .energy . out. This default behavior can be overridden by 
providing user-specified reference energies in a file called ref .energy . in. 

While the ground state compositions and energies are output in gs . out, the file gs.connect . out provides 
the additional information needed, in multicomponent systems, to describe how these ground states are 
connected by tie lines. Each facet of the ground state convex hull corresponds to a line in the gs.connect . out 
file listing the structure names associated with the vertices of that facet. An example of a convex hull in 
energy-composition space constructed with ATAT is shown in Figure la). 

Setting up grand-canonical Monte Carlo simulations (as discussed in the next section) demands the 
knowledge of the chemical potentials that stabilize various phase equilibria of the system. Since this is not 
obvious to figure out in a multicomponent system, mmaps generates a file chempot . out containing: 

1. The values of the chemical potentials that stabilize each fixed-composition multiphase equilibria of the 
system at absolute zero (e.g. in a n-nary system, all the n-phase equilibria). Each vector of chemical 
potential correspond to one facet of the ground state convex hull. 

2. For each ground state, a value of the chemical potentials that stabilize it at absolute zero. This is not 

a uniquely defined quantity — many possible chemical potentials values are able to stabilize a given 
ground state. Among the many possible values, mmaps calculates one by simply averaging the chemical 
potentials associated with all facets that intersect a given ground state. 

The remaining output files are the same as in maps, except that multiple concentration variables are out- 
put whenever necessary: maps. log, fit. out, predstr.out, gs.out, gs_str.out, eci.out, clusters. out. 

^or if some elements are excluded from some sublattiees, thus effectively restricting the composition range. 
"^Specifically, those with the largest sum of squared concentrations. 
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Figure 1: Computational thermodynamic study of the Samarium-doped Ceria system [14]. a) Example 

of ground state search via a convex hull construction in energy-composition space, b) Equilibrium com- 
position profile across (100) interface obtained through a hybrid canonical/grand-canonical multisublattice 
multicomponent Monte-Carlo simulation. Inset: Snapshot of the 70,000-atom simulation cell used. 
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Table 2: Example of control, in file indicating a scan of the region in (T, /zi, /Lt2, /Lt3)-space defined by 

100 < T < 200 and < < 1.0 and < < 0.5 and /is = 0.0 with a 10 x 5 x 5 grid. 

100 0.0 0.0 0.0 

200 0.0 0.0 0.0 10 

100 0.0 0.5 0.0 5 

100 1.0 0.0 0.0 5 

2.3 Hybrid Multicomponent Monte Carlo simulations 

The memc2 code implements general hybrid canonical/grand-canonicallattice gas Monte Carlo simulations in 
multicomponent systems. As most of the features are similar to the previous binary version, emc2, described 
in [1,3], we focus here on the differences. 

To function, this code requires at least 4 input files: lat . in (identical to the one input into mmaps), 
clusters. out, eci.out, gs_str.out, where the latter three are generated by mmaps. Given the larger 
number of composition variables, the range of temperatures and chemical potentials scanned in the course of 
grand-canonical run are no longer specified on the command-line. Instead, the control . in input file must 
be provided in the following format. The first line specify the initial conditions and has the form (for an 
n-component system): 

[T] [/ii] ... [/i„] 

where T is temperature and /ii, . . . , /U„ are the chemical potentials for each chemical species, in the same 
order as in the atoms. out file generated by mmaps. ^ Each subsequent line of this file indicates one of the 
axes along which to scan and has the format: 
[T] [/ii] ... [/ij [s] 

where s is the the number of steps made between the initial conditions and the final conditions given on the 
line. Table 2 gives an example of such file. 
A few important notes: 

1. Temperatures arc given in units of energy unless the -k or -keV options are used (the latter indicates 

temperature in K if energies arc in eV). 

2. By default, the finals conditions are excluded from the scan (in the example above the values of the 
first chemical potential scanned (see the last line) are 0.0 0.2 0.4 0.6 0.8). This behavior can be changed 
with the -il option, to give: 0.0 0.25 0.5 0.75 1.0). Alternatively, the -hf option gives: 0.1 0.3 0.5 0.7 
0.9. 

3. The mmaps code generates a file called chempot . out which contains special values of the chemical 
potential that stabilize various types of equilibria. These values are useful guidelines to select relevant 
regions in /x-space to scan. 

Canonical or hybrid canonical/grand-canonical simulations can be carried out by specifying, in a file called 
conccons . in, a set of linear constraints on the composition that must hold throughout the simulation. The 
code automatically generates a complete list of allowed multi-site Monte Carlo moves. It does so by first 
considering all possible single-atom identity flips (fc = 1), then all possible two-atom identity flips (fc = 2), 
etc. Any move violating the composition constraints is eliminated. After all fc-atom flips that satisfy the 
constraints have been considered, the code calculates the linear subspace spanned by the composition changes 
associated these flips. The process stops if this linear subspace has the same dimension as the space of allowed 
compositions. The moves generated by this process are guaranteed to satisfy detailed balance because for 
every move kept, its reverse move is also kept. 

The format of the conccons . in flle is similar to the crange . in flle of mmaps. For instance: 
-2*0+4*Ce+3*Sm=0 

would ensure overall charge balance in the fluorite-type CexSmi_a;0j^Vac2-j/ system. Constraining com- 
position variables is particularly useful to pin down the location of an interface in a multiphase interfacial 

^Alternatively, see the ouput of corrdump -pcm. 
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thermodynamic calculation, because the phase fractions are uniquely determined by the overall composition. 
An example of this use is illustrated in Figure lb). 

Any number of constraints can be given up to the point where all concentration variables arc fixed, which 
results in a fully canonical simulation. Note that constant composition simulations do not evade the need 
to make sure that the cluster expansion does not exhibit spurious ground states away from the specified 
hypcrplanc. If such ground states exist, the Monte Carlo code will reveal a phase separation into regions 
that do not locally satisfy the composition constraint, although the overall simulation cell does. Note that 
the constant specified on the right-hand side of the conccons . in file is ignored — this file constrains changes 
in composition only. The user-supplied initial atomic configuration (via the -g= [ground state number] or 
-is=[file] options) implicitly determines the right-hand side constants. 

The ineinc2 code outputs all of thermodynamic quantities it computes on the standard output and in a log 
file mc . out. Given the large (and variable) number of variables in a multicomponcnt system, the code writes 
a mcheader . out file indicating the content of each column of the main output file. By default all output 
quantities are grand-canonical in nature (i.e. they contain a "— /i • x" term, where x is composition and fi 
chemical potential) , regardless of the presence of composition constraints. To obtain "canonical" rather than 
grand-canonical quantities use the -g2c option. 

In addition to performing Monte Carlo simulations, the ineinc2 code reports thermodynamic quantities 
calculated via the low-temperature expansion (indicated by a _lte suffix ) or the mean-field approximation^ 
(indicated by a _mf suffix ). A useful feature is the ability to skip Monte Carlo simulations whenever the free 
energies obtained via mean-field approximation and the low temperature expansion fall within a specified 
threshold (using the -mft= [tolerance] option), as this indicates that cither one is sufficiently accurate. 
These approximations are also used to set initial values of the free energy when performing thermodynamic 
integrations. By default, if the initial configuration is a ground state (-gs= [a positive integer] ) the low- 
tcmperatiirc expansion provides the starting point while Monte Carlo averages arc used to update the free 
energy as temperature and chemical potentials are varied. If the initial configuration is the fully disordered 
state (-gs=-l) a mean-field approximation (i.e. high-temperature expansion) is used as the starting point 
instead. 

Finally, it is possible to specify temperature-dependent interactions (in a file called teci.out, which, 
when present, takes precedence over eci.out). This feature can be used to account for nonconfigurational 
sources of entropy through a coarse-graining scheme [17, 18] . Introducing temperature-dependent interactions 
is not completely trivial, as various averages (such as the internal energy E) must correctly account for the 
temperature dependence of the interactions in order for the following thermodynamic relationship (upon 
which thermodynamic integration is based) to hold: 



where /3 = (fc^T) and F is the Helmholtz free energy. To illustrate this, we express the free energy F in 
terms of the partition function and the coarse grained free energy'' Fi of each microscopic state i: 



Hence, the quantity to average over the Monte Carlo steps is [Fi + f3 {dFi/d/3)] instead of just Fi, as a naive 
analogy with the constant interaction case would have suggested. 

The next section describes how to proceed to obtain suitable temperature-dependent interactions that 
model vibrational and electronic excitations. 

^Whilc the emc2 code also reports the high-temperature limit, this was abandoned in the memc2 code since it is identical the 

mean-field values for the pure phases. 

^In a coarse-graining framework, each microscopic configurational state i has a free energy that can be expressed in terms 
of a partition function summing over all vibrational and electronic excitations j associated with a given configuration i on the 



d{0F) 
df3 
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dp 




j:,[Fi+(3{dF,/df3)]e-^^^ 
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3 Nonconfigurational sources of entropy 



3.1 Lattice vibrations 

The ATAT software offers two techniques to ealculate vibrational free energies. 

The f itf c code constructs a Born- von Karman spring model by fitting the reaction forces resulting from 
imposed atomic displacements in a supercell calculation [17, 19,20]. This is the preferred method^ for very 
accurate calculations, since the range of interactions of the model's effective springs can be increased until 
the desired accuracy has been reached. However, it is computationally demanding to calculate vibrational 
free energies for a large number of structures in this fashion (as would be required for cluster expansion 
construction). 

For this reason, ATAT includes an alternative method: The stiffness- vs. -length scheme [17], which is 
implemented in the fitsvsl and svsl codes. This approach exploits the fact that the effective springs 
of a Born-von Karman model arc often transferable from one structure to another (after controlling for 
simple factors such as bond length [17,27-29], or composition [30]), thus reducing the number of spring 
constant calculations needed. This method's accuracy is limited by the transferability assumption itself, 
which typically does not extend beyond nearest neighbor springs. Fortunately, this assumption is verifiable 
and we shall see how. 

3.1.1 The fitfc code 

Calculations with the fitfc code proceed in a series of steps: 

1. The structure of interest first needs to be fully relaxed. The code expects the relaxed geometry in a 
file called str jrelax . out. This file's format is common to all ATAT tools and is described in [1] as 
well as in the help obtained via mmaps -h. The fitfc command also expects a str . out file containing 
the unrelaxed geometry (which may be the same as the relaxed geometry, if desired). The unrelaxed 
geometry will be used to determine the neighbor shells and measure distances between atoms. The 
rationale for allowing for two types of input structures is that it is easier for most users to specify or 
identify shells of neighbor using "idealized" structures (e.g. hep with ideal c/ a ratio) rather than using 
the fully relaxed structures. Moreover, in the context of cluster expansion construction, where phonon 
calculations on multiple structures are needed, the neighbor shell distances are directly transferable 
between unrelaxed structures but not between relaxed structures. 

Typically the user would specify the str. out and then obtain the str jrelax . out file by running an 
ab initio code with a command of such as runstruct_xxxx, where xxxx stands for the name of the ab 
initio code used (see Section 6 below for calling ab initio codes within ATAT). Of course, the input 
files for the xxxx code must indicate that all degrees of freedom need to be relaxed. Alternatively, file 
str . out and str jrelax . out files could have been automatically generated in a prior mmaps execution 
(in fact, the perhaps odd extension . out reflects the fact that these files are typically outputs of the 
mmaps and runstruct_xxxx codes.). 

2. Next, a set of perturbed geometries must be generated. A typical command line is as follows: 

fitfc -er=11.5 -ns=3 -ms=0.02 -dr=0.1 
The -er option specifies how far apart the periodic images of the displaced atom must be (as measured 
in the unrelaxed structure str. out and in the same units, usually A). The code uses this information 
to finds the smallest supercell satisfying this constraint. While this is the only required parameter, the 
user has control over other parameters. 

-dr specifies the magnitude of the displacement (default: 0.2 A) of the perturbed atom (in the same 
units as in the str . out file, typically A) 

-ns specifies the number of different isotropic strain levels at which phonon calculations will be per- 
formed. Specifying -ns=l implies a purely harmonic model (the default), while values greater than 1 
will invoke a quasiharmonic model. The latter accounts for thermal expansion by allowing the phonon 

®In principle, linear response calculations [21-26] are even more preferable in the sense that they account for infinite range 
interactions, which is important in the case of ionic materials. Linear response calculations are to be performed within the ab 
initio code itself and ATAT consequently does not implement them. 
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frequencies to be volume-dependent. In subsequent steps, the code will determine the volume as a 
function of temperature by minimizing the free energy. 

-ms specifics the maximum level of strain (e.g. 0.02 signifies a 2% elongation along every direction). 
The above command writes out a series of subdirectories of the general form vol_*, one for each level 
of strain (* refers to the usual UNIX wildcard). 

If the structure has cubic symmetry and no internal degrees of freedom or if one only wishes to use the 
harmonic approximation, the fitfc command should be invoked with the -nrr option (do Not ReRelax). 
In these case, the relaxation step 3 below can be skipped. 

3. Each vol_* subdirectory now contains a str . out file which is stretched version of the main str jrelax . out 
file provided. (This naming convention is a bit confusing but is dictated by the fact that these 
vol_*/str .out files are the starting point of another relaxation step.) The ab initio code then needs 
to be invoked to rerelax the geometry at the various levels of imposed strain and obtain the energy as 

a function of strain. Typically: this is achieved by typing:^ 

pollmach -e runstruct_xxxx -w ionrelax . wrap 
The pollmach -e . . . prefix causes the command specified next to be run in all subdirectories con- 
taining a file called wait (a file automatically written by fitfc with this purpose in mind) before 
exiting. Here, the command to be run is runstruct-xxxx, which invokes the ab initio code xxxx. The 
option -w ionrelax. wrap specifies an alternative user-specified input file for the ab initio code, which 
must indicate that all degrees of freedom except volume must be allowed to relax. This additional 
relaxation step is needed because stretching may have modified the equilibrium atomic positions. The 
relaxed (but stretched) structures now reside in the vol_*/str jrelax . out files while the corresponding 
energies will be in vol_*/energy. 

4. Now, one needs to invoke fitfc again to generate perturbations of the atomic position for each level 
of strain, e.g.: 

fitfc -er=11.5 -ns=3 -ms=0.02 -dr=0.1 
This is exactly the same command as before but the code notices the presence on the new files and 
proceeds further. 

At this point the files generated are arranged as follows. At the top level, there is one subdirectory per 
level of strain (vol_*, where * is the strain in percent), and in each subdirectory, there are a number 
of subsubdirectories, each containing a different perturbation. The perturbation names have the form 

p [+/-] [dr] _[er] .[index] , 
where [pertmag] is the number given by the -dr option, [er] the number given by the -er option 
and [index] is a number used to distinguish between different perturbations. Two perturbations that 
differ only by their signs are sometimes generated and are distinguished by a + or - prefix. (These 
opposite-sign perturbations ensure that the effect of third-order force constants cancel out exactly in 
the fit. Note that whenever the third-order terms cancel out by symmetry, the code will recognize 
this and only generate the "positive" perturbation. If one is not concerned with errors introduced 
by third-order force constants and wishes to save time, "negative" perturbations can be ignored at 
this point by deleting the vol_*/p-*/wait files.) Each subdirectory of the form vol_*/p* contains (i) 
the ideal unrelaxed supercell in a str_ideal . out file, (ii) the relaxed but unperturbed supercell in a 
str_unpert . out file and (iii) the actual geometry of perturbed supercell calculation in a str . out file. 

5. The ab initio code must be invoked again, this time to calculate reaction forces for each perturbation. 
This will typically be accomplished by typing the now familiar construct: 

pollmach -e runstruct_xxxx -w force. wrap 
where the force. wrap must now indicate that no degrees of freedom are allowed to relax. Each 
subdirectory of the form vol_*/p* will now contain a force . out file (and a number of other ab initio 
code-specific files not read by fitfc). A str_relax . out is also generated, although it contains the 

"The pollmach command looks for files called wait in all subdirectories as a signal that the specified command must be 
run the directory containing that file. Hence it is prudent to double check that there are no leftover vait files from earlier 
calculations: find . -name wait 

-"'A special note to VASP users: Smearing methods must be used for Brillouin zone integration in force calculations, unless 
the system is insulating. Do not use the DOSTATIC option in the force. wrap file (this is not VASP keyword per se but it is 
interpreted by the runstruct.vasp command.). 
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same geometry as the str.out file since no relaxation took place. This concludes the "generation" 
phase of the process. 

6. Next, the code must fit the force constants and perform the actual phonon calculations. This mode is 
invoked via the -f option: 

fitfc -f -fr=. . . 

In addition, the range of the springs to be included in the fit is specified using the mandatory parameter 
-f r=. . . . Usually, the range specified with -fr should bo no more than half the distance specified with 
the -er option earlier. Distances are measured according to the atomic positions given in str.out 
and in the same units (typically A). It is a good idea to try different values of -f r (starting with the 
nearest neighbor bond length) and check that the vibrational properties converge as -f r is increased. 
The output files will be described below. 

7. Sometimes, the message 'Unstable modes found. Aborting.' is printed. This indicates that the 
structure considered may be mechanically unstable. If, in addition, you see the warning 'Warning: 
p. . . is an unstable mode.', then the structure is certainly mechanically unstable. Otherwise, it 
may be an artifact of the fitting procedure. To find out, the code can generate perturbations along the 
unstable directions and let the ab initio code calculate the reaction forces which can then be included 
in the fit to settle this issue. 

First, to view the unstable modes, use the -fu option. The output will be in voL*/unstable.out and 
has the form: 

u [index] [nb_atom] [kpoint] [branch] [frequency] 
[displacements. . .] 

where [index] is a reference number, [nb_atom] is the size of the supercell needed to represent this 
mode and [displacements] is a vector of 3n elements defining the displacement mode, where n is the 
number of atoms in the unit cell. The other entries are self-explanatory. (If this file contains only 
entries with nb_atom=tooJarge, one needs to increase the -mau option beyond its default of 64.) One 
of these modes can be selected to be written out to disk with the option -gu= [index] where [index] 
is the one reported in the unstable . out file. This new feature aimed at the elimination of fictitious 
unstable modes was first used in [31,32]. 

8. The ab initio code should then be run in the subdirectory generated (named vol_*/p_uns_[dr] _[kmesh] .[number] ) 
and rerun, from the top-level directory, fitfc -f -fr=. . . 

If the message "Warning: p. . . is an unstable mode" appears, then a true instability has been 
found. If only "Unstable modes found. Aborting" is printed, one may repeat the process until the 
message disappears or a truly unstable mode is found. 

Note: The -f n option can be used if one wishes to generate a phonon DOS even if there arc imstablc 
modes. The unstable modes will be shown as negative frequencies. The -f n option eliminates the 
printed error message but not the underlying problem. 

The output files are as follows: 

1. fitfc.log : A general log file. 

2. vol_*/vdos . out : the phonon density of states for each volume considered 

3. vol_*/fc.out : The force constants. For each force constant a summary line, prefixed with "svsl" 
gives: (i) the atomic species involved, (ii) the 'bond length', (iii) the stretching and bending terms. This 

can be useful to check the validity of the length-dependent transferable force constant approximation. 
Then, each separate component of the force constant is given and, finally, their sum. 

4. vol_*/svibJit : gives the high-temperature limit of the vibrational entropy (in units of Boltzman con- 
stant per atom) in the harmonic approximation, excluding the configuration-independent contribution 
at each unit cell volume considered (so, this just —3 times the average log phonon frequency). 

-'-'For some ab initio codes, the ordering of the atoms in str.out and str_relax . out may differ, but fitfc is able to figure 
out how to assign each force to the correct atom, provided the atoms in strjrelax.out and in force. out are in the same order. 
All the rimstruct-xxxx commands provided with ATAT ensure that this is the case. 
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5. f itf c.out : gives along each row, the temperature, the free energy, and the hnear thermal expansion 
(e.g. 0.01 means that the lattice has expansion by 1% at that temperature). 

6. f vib : gives only the free energy (this is the output file that may be subsequently use to construct a 
cluster expansion) 

7. svib : gives only the entropy (in energy/temperature, by default, eV/K) 

8. eosO . out : equation of state at OK (each line reports one level of strain and its associated energy) 

9. eosO . gnu : gnuplot script to plot the equation of state polynomial and data. 

The files within the subdirectories vol_* are based on the harmonic approximation for a given fixed 
volume while the files in the main directory give the output of the quasiharmonic approximation if ns> 1 

and the harmonic approximation if ns= 1. 

Examples studies employing this code can be found in [31-34], among others. 

3.1.2 The fitsvsl and svsl codes 

The fitsvsl code determines bond stiffness vs bond length relationship for the purpose of calculating 
vibrational properties (with the svsl code, described subsequently). Using this pair of codes pays off only 
if vibrational properties of large number of similar structures are needed, because the so-called transferable 
length-dependent force constants can be determined from a small set of structures with the fitsvsl code, 
while vibrational properties can then be quickly predicted for a much larger set of structures with the svsl 
code. 

fitsvsl requires the following files as an input. 

1. A lattice file (lat.in) which allows the code to determine what chemical bonds are expected to be 
present in the system. The format is as described in the mmaps documentation (see mmaps -h). 

2. A file called strngLine . in containing a white spaco^^-scparated list of directories containing structures 
that will be used to calculate force constants. Typically these would numbered directories previously 
generated by mmaps, although user-specified directories can be included. This set of structures must 
be diverse enough so that it contains at least one chemical bond of each type that will be encountered 
with the svsl code. For a binary A-B alloy this typically involves the need for a "pure A" and a "pure 
B" structure, in addition to one with an intermediate composition. Each of the directories listed in 
strname . in must contain: 

(a) a str . out file containing an ideal unrelaxed structure that will be used to automatically determine 
the nearest neighbor shell, 

(b) a str_relax . out file containing the relaxed structure that will be used to calculate bond lengths 
and that the code will perturb in various ways to determine the force constants. 

Like the f itfc code, the fitsvsl code can operate in two modes: a structure generation mode and a 
fitting mode (indicated by the -f option). In structure generation mode, all the above input files are needed 
and the option -er must be specified in order to indicate the size of the supcrcells generated. The -er 
option specifies the minimum required distance between a displaced atom's periodic images. The code will 
automatically infer the smallest supercell satisfying this constraint. Typically, -er should be 3 or 4 times 
the nearest neighbor distance. All of these distances are mcasmed using the ideal structures (str. out). 

Various optional parameters (which have reasonable default values) can be specified: the -dr, -ms and 
-ns options have the same meaning as for the fitfc code. Unlike the f itfc code, the default number of 
different isotropic strain levels considered is 2 because this is the minimum needcid to be able to determine 
the length-dependence of bond stiffness. More than one level of strain is necessary even if only the harmonic 
approximation is desired. Another difference is that the fitsvsl code is typically run from a "top level" 
directory containing multiple structure subdirectories (e.g. as generated by mmaps), because it handles 

i^A "white space" is a space, tab or newline. 
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properties that are transferable across structures. The f itf c code is to be run within a structure directory 
because it appHes to one structure at the time. 

After the structure generation step, each of the directory specified in strnaine . in will contain a hierarchy 
of subdirectories with various perturbed structures with a layout identical to the one described in step 4 of 
section 3.1.1 above. The appropriate first-principles calculations are then carried out as in step 5 of that 
same section. 

After completion of the first-principles calculations, f itsvsl must be invoked in fitting mode (with the 
-f option). The lattice file (e.g. lat.in) must be present and the code will look for data to fit in all 
files of the form */force.out, */*/f orce . out and the corresponding files */str.out */strjrelax.out 
*/*/str.out */*/strjrelax.out. 

The code will then use that information to create the length-dependent force constants (this may take a 
few minutes) and ouputs them in slspring. out. Here is an example of the format of this file: 
Al Al (gives the type of bond) 
2 (2 parameters: linear fit is used) 

(polynomial coefficients of the stiffness vs length relationship for stretching term:) 
50.29 (constant, typically in eV/A^) 
7.89 (slope coefficient, typically in eV/A^) 
2 (idem for bending term, 2 parameters for a linear fit) 

6.13 etc. 
-1.02 

Al Ti (repeat for each type of chemical bond) 

Whenever there is not enough data to fit a given parameter, it will appear as in the above output 
file, so the user must carefully inspect this file. This may happen if there are not enough structures in the 
strneime . in file or if there are too few levels of strain applied (-ns option) or if some ab initio calculations 
aborted without producing a force. out file. 

A few options arc available to control the fitting process. The -op option specifies the order of the 
polynomial used to fit the stiffness vs length relationship (the default is -op=l for a linear relationship). 
More sophisticated models^^ are also available, such as direction-dependent force constants (-dd option) or 
composition-dependent force constants [30] (the -pc option specified the order of the polynomial representing 
the composition-dependence). 

Diagnostic files are also output: The fitsvsl.log file contains all the matrices generated in the fitting 
process, including the actual forces, compared with the ones predicted by the spring model. A plot of the 
stiffness vs. length relationship is available in the files f itsvsl. gnu (a gnuplot script) and f_*.dat the raw 
stiffness data to be plotted. This is helpful for assessing the validity of the transferability assumption, as 
illustrated in Figure 2. 

Note that the f itf c and f itsvsl codes are mutually compatible: The perturbations generated with one 
code can be used in a fit carried out with the other. This is useful if one realizes midway through the process 
that one approach is too slow or the other too inaccurate. 

Once the slspring. out file has been generated with f itsvsl, the svsl code can exploit it to rapidly 
predict vibrational properties of numerous similar structures. Like f itf c, svsl is structure-specific and must 
be run within a structural directory containing the following files (three of which should now be familiar): 

1. a "relaxed" structure file (str_relax.out) and 

2. an "unrelaxed" structure file (str.out). The latter provides the ideal atomic positions that are used 

to automatically determine which atoms lie in the nearest neighbor shell. This file can be the same as 
the relaxed structure but then the determination of the nearest neighbor shell may be less reliable. 

3. In addition, svsl requires a transferable force constant file slspring. out. It looks for this file in the 
current directory and, if not found, it also looks one directory up, where it would typically have been 

])];icccl ]')y an carlioi' f itsvsl command. 

l^A more general format for the slspring. out file then used — contact the author for more information. 
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Figure 2: Bond stiffness versus bond length relationship in the fee Cu-Li system. The proximity of the 
crosses to the solid lines is a strong indication of the model's predictive power in this system. 



4. An optional input file defining the atomic masses can be provided. By default, the code looks in 
the "/.atat.rc file to determine the directory where ATAT is installed and then loads the file 
data/masses . in. This behavior can be overridden with the -m option. 

5. By default, the bulk modulus is calculated from the force constants (which is quick but potentially 
inaccurate) but it can also be read from a file (whose name is specified with the -bf option) or specified 
on the command line with the -b option. 

A few parameters govern the accuracy of the calculations. (The default values are all reasonable starting 
points.) 

1. By default the code uses the harmonic approximation, but thermal expansion can be accounted for by 
specifying the -ns option with a value ns greater than 1. In that event, svsl calculates the vibrational 
free for ns different lattice parameters ranging from its equilibrium OK value to a value (1 + ms) times 
larger, where ms is specified via the -ms option (the default is 0.05). Typically -ns=5 is sufficient, but 
the computational cost of using higher values is minimal. 

The code calculates the vibrational free energy for temperatures ranging from TO to Tl in steps of dT, 
where these three values are determined as follows. 

(a) The defaults are TO = 0, Tl 2000, dT = 100. 

(b) If a Trange . in file exists in the next upper directory, it is used to set TO, Tl, dT: It must contain 
two numbers on one line separated by a space: Tl (Tl/dT + 1). Note that TO = always. 
For phase diagram calculations, this method must be used to specify the temperature range 
because it ensures that all calls to svsl (and to other codes described subsequently) use the same 
temperature range. 

(c) The above values can be overridden by the -TO=TO, -T1=T1 and -dT=ciT options. 
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2. The fc-point sampling grid is specified with the -kp=fcp option. The actual number of fc-points used 
is at least kp divided by the number of atoms in the cell. An internal algorithm builds a fc-point grid 
that equalizes as much as possible the fc-point density along all directions in space 

The output files are as follows: 

1. svsl . log : a log file giving some of the intermediate steps of the calculations in particular, the predicted 
bulk modulus which should be checked for accuracy. 

2. vdos.out: the phonon density of states for each lattice parameter considered (unstable modes appear 
as negative frequencies). 

3. svsl. out: gives along each row, the temperature, the free energy, and the linear thermal expansion 
(e.g. 0.01 means that the lattice has expansion by 1% at that temperature). 

4. f vib: similar to svsl . out but gives only the free energy. 
Examples of studies employing this code can be found in [34-36] 

3.2 Electronic excitations 

The f elec code calculates the electronic free energy within the one-electron and temperature-independent 
bands approximations. It expects a dos . out input file containing an electronic density of states and the num- 
ber of electrons to populate it. This file is typically generated by the ab initio call up utilities runstuct_xxxx 
(see Section 6). It is rarely needed to use anything but the default option: 
felec -d 
The output files are: 

1. felec.log, containing the temperature and corresponding free energy (per unit cell) on each line. 

2. felec is similar to felec.log but only contains the free energies. 

3. plotdos . out contains the dos in a format suitable to be plotted, with the energy normalized so that 
the Fermi level is at 0. 

It should be noted that the above approximation are not always satisfactory, notably in magnetic systems. 
There are two (currently experimental) magnetic free energy calculators included in ATAT: 

1. fmag, which relies on a simple quantum mechanical mean field model and 

2. f empmag, which implements the widely used semi-empirical method introduced in [37]. 

3.3 Cluster expansions of nonconfigurational free energies 

The f itf c, svsl and felec codes enables the calculation of free energies of individual configuration on a 
lattice. However, to properly calculate the free energy of system exhibiting both configurational and noncon- 
figurational disorder, a cluster expansion needs to be constructed to be able to predict the nonconfigurational 
free energy of any given configuration. This cluster expansion can then be provided as an input to the Monte 
Carlo code meinc2 which will now be able to properly account for both configurational and nonconfigurational 
contributions to the free energy. Here is how to proceed. 

1. First create a Trange.in file indicating the temperature range to be sampled. For instance, if one 
wishes to sample from OK to 2000K in intervals of lOOK, type 

echo 2000 21 > Trange.in 
(The lowest temperature is always OK and note that 2000/(21 — 1) = 100.) 
This ensures that all calls to f itf c, svsl and felec with use the same temperature range. 
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2. ATAT provides the f oreachf ile utility to repeat the same command in all subdirectories containing 
a specified file. For instance, to calculate the vibrational free energy using svsl in all structure subdi- 
rectories, one could type 

f oreachf ile -e strjrelax.out svsl -d 
This executes the svsl code with the default options (-d) in each subdirectory containing a str jrelax . out 
file while skipping directories with an error file (-e). If desired, a similar process can be used for the 
electronic free energies: 
f oreachf ile -e dos.out felec -d 

3. Next, construct a cluster expansion of the vibrational free energy data contained in the */f vib files: 

clusterexpand fvib 

Note that the files */fvib actually contain a column of numbers (one for each temperature) — 
clusterexpand simply produce a separate cluster expansion for each scalar element in */fvib. By 
default, this commands reads in the clusters from the clusters. out file, as they were generated ear- 
lier by mmaps.^^ The user has some control over which clusters to include (1) or exclude (0) via the 
-s= [comma-separated string of Os and Is] option. The user can also check if the crossvalidation 
score, reported with -cv option, can be improved by changing the cluster selection. A similar expansion 
can be done for electronic contributions: 
clusterexpand felec 

4. Finally, combine the energy cluster expansion from eci . out (generated by mmaps) with the vibrational 
(fvib.eci) and electronic (felec. eci) cluster expansions created in the previous step into a single 
"master" cluster expansion file (teci.out) that will be read by the Monte Carlo code: 

mkteci fvib.eci felec. eci 
For this step to succeed it is essential that all clusterexpand commands were run with the same 
mmaps-generated clusters . out file (although different clusters can be included or excluded with the 
-s option of clusterexpand without any problem). 

5. Run memc2. It is often instructive to try and compare simulations that include or exclude some 
components of the nonconfigurational free energy. To this effect, one can simply rerun mkteci with 
the list of contributions to include before rerunning memc2. 

Example of the use of this methodology can be found in [34-36]. For easy reference. Figure 3 shows a 
flowchart of the entire procedure while Figure 4 showcases a practical example of this methodology. 

4 Special Quasirandom Structure generation 

The use of the cluster expansion is essential to properly model nondilute alloys exhibiting configurational 
disorder with potential short-range order ([39]). However, at sufficiently high temperatures, the limiting 
case of a fully random alloy provides a useful approximation. Although this limit can be calculated with 
a cluster expansion, it is sometimes convenient to directly determine high-temperature limiting quantities 
with a small number of supercell ab initio calculations without first constructing a cluster expansion. The 
concept of Special Quasirandom Structures (SQS) [40] provides an efficient way to take this route. An 
SQS represents the best periodic supercell approximation to the true disordered state for a given number 
of atoms per supercell. The quality of an SQS is measured in terms of the number of correlations of the 
fully disordered state it is able to match exactly. Typically, one attempts to preferably match shorter- 
range correlations while gradually enlarging the supercell to extend the range of matching correlations until 
convergence of the properties of interest. Traditionally. SQS are built based on matching the pair correlations, 
although for best performances, multibody correlations should be considered as well. The SQS generation 
utility included in ATAT (gensqs) can account for any user-specified set of correlations. 
The SQS generation process in ATAT (as used, for instance, in [16,39,41]) is as follows: 

-'^If the user wishes to use clusterexpand without first running mmaps, a suitable clusters. out file can be generated with 
the a command of the form corrdump -clus -2= [max pair radius] -3= [max triplet radius] etc. A lat . in file still needs 
to be supplied. 
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Figure 3: Flowchart of ATAT's approach to the integration of configurational and nonconfigurational con- 
tributions to the thermodynamics of alloys. 



1. Prepare a lat . in file (just like the one used for the mmaps and memc2 codes). 

2. Prepare a file (say, cone . in) specifying a structure having the same overall composition as the disor- 
dered state of interest. The exact geometry of this structure is irrelevant — it only needs to have the 
correct overall composition and be a proper superstructure of lat . in. Each site must be occupied by 
one and only one atom. For instance, given the lattice of Table 1, a valid cone. in file could be: 

1 1 1.05 90 90 90 
0.0 0.0 
1.0 0.0 
0.0 1.0 
Fe 
0.5 0.5 C 
5 C 

C.Vae 
5 Ni 
C.Vae 
C.Vae 
5 C.Vae 

This way of specifying composition was selected because it allows the user to individually control the 
composition on each sublattice. 

Select a small number of correlations to match exactly and compute their value in the disordered state: 

eorrdump -noe -rnd -s=eone.in -2= [pair max radius] — 3= [triplet max radius] > tcorr.out 
This command writes the nonempty (-noe) correlations of the disordered state (-rnd) having the same 
composition as the structure eone.in into the file teorr.out. Pair correlations up to distance [pair 
max radius] and triplet correlations up to [pair triplet radius] are considered (larger clusters 
can also be specified with -4=. . . , etc.). This command also writes a cluster file clusters. out that 
will be read in the step. 

The SQS generation step is initiated with, for instance, 

gensqs -n=16 > sqsl6.out 
which generates 16-atom SQS and puts them into the sqsl6.out file. If this file is empty, the range 
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Figure 4: Ab initio thermodynamics of the Cu-Li system [34]. a) Ground state convex huU (merging bee and 
fee superstructures) . b) Calculated free energy as a function of composition including both configurational, 
vibrational and electronic contributions. Nonconfigurational contributions are essential to accurately predict 
the numerous phase transitions between bcc- and fcc-based phases. Two-phase equilibria are marked by 
thick gray tie-lines, c) Calculated electromotive force (emf) inferred from the calculated free energy in b) 
and compared with experimental data. (The solid-liquid Li free energy difference was left as a parameter 
adjusted to provide the best agreement.) d) Calculated phase diagram obtained via tangent constuctions as 
in b) over a range of temperatures. The liquid phase excess free enegy is taken from the COST database [38]. 
Note that most of the ground states from a) disorder at low temperature and are not visible in the phase 
diagram. Interestingly, neither the magnitude of each ground state's formation energy nor the sharpness 
of the vertex where a ground state touches the hull bear any correlation with a ground state's disordering 
temperature. The B32 ground state at 50% (shown in inset) does not have the most negative formation 
energy and barely breaks the hull. Yet it exhibits tremendous stability due to entropy-driven stabilization. 
The dashed line marks the location of a second-order transition, determined by locating a peak in the heat 
capacity. 
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of correlations to be matched exactly needs to be reduced in step 3. A word of caution: gensqs only 
generates structures containing exactly the number of atoms per unit cell specified by the -n option 
(i.e., if an SQS with a smaller unit cell exists, it will not be listed). 

5. The output file sqsl6.out may contain a large number of candidate SQS. While these a match the 
specified criteria, they may differ in quality in terms of the correlations not included in the screening 
process. Further screening can be performed with the command 

corrdump -2= [another radius] -3= [another radius] -4= [another radius] -noe -s=sqsl6.out 
which will output the correlations of all SQS in sqsl6 . out. Clearly, the radii specified should be larger 
than in step 3 in order to distinguish and rank these SQS. 

While there are no formal rules to decide whether it is preferable to match more pairs or more multi- 
body correlations, general guidelines can be given. In general, it would be; unusual to match longer range 
n-body correlations than m-body interactions if n > m. In systems exhibiting large relaxations (due to large 
atomic size mismatches), multi-body correlations are expected be important, while in ionic systems, pair 
interactions are expected to dominate. 

SQS generation via exhaustive enumeration is time-consuming. This can be alleviated by exploiting 
parallelism. Multiple independent copies of gensqs can be run on separate processors if one specifies 
the -ip= [index] and -tp= [number of processes] options. The [index] runs from to [number of 
processes-1] and is used to tell each copy of gensqs which portion of the space to scan. For maximum 
efficiency, [number of processes] should be a factor of the total number of possible supercells, which can 
be determined, e.g., with the gensqs -n=16 -pc command. 

Another way to improve SQS generation speed is via stochastic sampling [42]. Such an approach seeks 
to quickly find close-to-optimal solutions rather than slowly finding an exact solution. Another approach 
is to scan through configurations in such as way that only the desired composition is sampled, using atom 
permutations rather than atom "transmutations". This scheme becomes increasingly advantageous as one 
moves away from equiatomic compositions. Yi Wang contributed a useful code implementing these two 
algorithms (that wore used in [16]), which will be included in the ATAT distibution shortly. 

Finally, we note that SQS prove useful in the construction of a cluster expansion as well. Any generated 
SQS can be included into the set structures used to fit a cluster expansion as a way to ensure that the 
properties of the disordered state are properly reproduced. In practice, one; can merely place the desired 
SQS into a str.out file within a subdirectory and run an ab initio code on it with runstruct_xxxx. The 
mmaps code will readily read in any user-specified structure placed in its directory hierarchy (provided the 
lat . in files used in mmaps and gensqs are identical). 

5 The tensorial cluster expansion 

While the formalism enabling the cluster expansion of tensors has been derived in [43], we provide here a 
brief explanation of how to use this feature. In Section 3.3, it was shown how to cluster expand a scalar — 
the process is similar for tensors. 

1. Create a lat. in providing the lattice and a gcetensor.in file describing the type of tensor in the 
form: 

[rank] 

[list of pairs of indices indicating which simultEineous index permutations leave 
the tensor invariant] 

[next list , etc . . . ] 

To fix the ideas, here are a few examples. For strain or stress the gcetensor . in should be: 
2 

1 (indicating that such tensor is symmetric, i.e., Sij = eji) 
while for elastic constants, it should be: 
4 

1 (indicating djki = Cjiki) 
2 3 (indicating Cijki = Cijik) 
2 13 (indicating Cijki = CkUj). 
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2. Generate a list of clusters with the command 

gee -clus -2= [max pair radius] -3= [max triplet radius] etc. 
Note that gee (which stands for Generalized Cluster Expansion) admits essentially the same syntax 
as corrdump in the scalar case. 

3. Calculate the property tensor associated with each structure. This is highly application-dependent. 
For instance, to calculate the static lattice strain, one could use (assuming ab initio calculations have 
been run in each subdirectory): 

foreachfile -e str jrelax . out analrelax -d ">" strain 

To calculate the elastic constants one could use: 

foreachfile -e str_relax.out calcelas -d (this generates perturbations) 

pollmach -e runstruct_xxxx -w force. wrap (this calculates the induced reaction stresses) 

foreachfile -e str_relax . out calcelas -f ">" elas (this fits the elastic tensor and writes 

it in the file elas.) 

4. Finally, do the cluster expansion itself with: 

clusterexpand -pa -g strain 

clusterexpEind -pa -g elas 
the -g invokes the generalized cluster expansion option while the -pa indicates that the quantities to 
be expanded are intensive (i.e. per atom). 

6 Interfaces with ab initio codes 

All interfaces have the general name runstruct_xxxx. Currently, the following are supported: runstruct_vasp 
(for the VASP code [44,45]) runstruct_abinit (for the abinit [46,47] code), runstruct_gulp (for the 
GULP code [48,49]). In addition, Monodeep Chakraborty, Jiirgen Spitaler, Peter Puschnig and Claudia 
Ambrosch-Draxl have written an interface with WIEN2k [50] , which will be publicly available shortly. Finally, 
runstruct_pwscf (for the PWscf code [51]) is under development. 
Each interface has the following characteristics in common: 

1. It reads the geometry of a structure from the str. out file in current directory, written in the ATAT 
format. If a strJiint.out file exists, it takes precedence over the str. out file. This is useful to 
provide educated guesses of the relaxed geometry via external user-supplied codes. 

2. It reads some code-specific parameters from a file called xxxx.wrap (or specified with -w [file]) 
located in the current directory or up to two levels above. This separation between the geometry and 
calculation parameter input files is essential to ensure that all the pieces of ATAT are fully interoperable. 

3. It runs the appropriate ab initio or atomistic code. If an argument is given to the runstruct_xxxx 

this string is used as prefix to the ab initio command. This feature is used to run the appropriate code 
remotely without having to install ATAT on the compute nodes. 

4. It writes the structure's energy in the file energy and the structure's relaxed geometry in str_relax . out. 
(If no relaxations are allowed, then the unrelaxed geometry is written in str jrelax. out.) 

5. It writes the forces acting on all atoms in forces . out, the stress acting on the cell in stress . out and 
the electronic density of states in dos . out (although these are not yet implemented in all runstruct-xxxx 
commands). 

6. If anything goes wrong with the calculations, a file called error is written. 

Typically the ab initio codes arc not called one run at the time but rather as a batch of many jobs. The 
pollmach command manages such pools of jobs. The basic principle is simple: If a file called wait exists 
in a subdirectory, pollmach will find it and run the command specified on the command line within that 

^^This list of structures could come from a previous minaps run, for instance, or it could be generated manually using the 
genstr command. 



20 



directory. For instance, 

pollmach runstruct jcxxx 

will repeatedly wait for a wait file to be found, delete it. and then run runstruct_xxxx with the corresponding 
directory. In a multiprocessor environment, pollmach can simultaneously dispatch different jobs to different 
processors. This simple mechanism allows most of the ATAT codes to be completely platform-independent, 
with pollmach being the only platform-dependent piece. 

The first time pollmach is run, default configurational files are set up and the user will be asked to 
tailor them to the local computing environment. The ATAT distribution includes many examples of setups, 
including the increasingly common case of a large pool of processors to be divided up into subgroup that 
each run a separate parallel version of an ab initio code. 

7 Utilities 

We now briefly mention some utilities to give users direct access to some of the inner algorithms of ATAT. 

1. corrdump finds symmetry operations, enumerates clusters, calculates correlations (including of the 

disordered state), etc. 

2. genstr enumerates superstructures of a given lattice. 

3. pdef generates substitutional point defect supercells. 

4. cellcvrt manipulates ATAT-formatted structure files, changing coordinate systems, converting frac- 
tional to cartesian coordinates, finding supercells and subcells, etc. 

5. Isf it implements least-squares fitting. 

6. nnshell finds nearest-neighbor shells. 

7. A number of text parsing utilities: getvalue, getlines, sspp. 

More information regarding these utilities can be found in their respective help files (accessed by specifying 
the -h option). 

8 Conclusion 

This completes this overview of the various new features added to ATAT in recent years. What's next? 
Probably: 

1. A tighter integration between the output of the Monte Carlo code and thermodynamic databases for 
use with softwares such as Thermocalc and Pandat; 

2. Material propery optimization modules exploiting the tensorial cluster expansion; 

3. More automated ways to include nonconfigurational free energy; 

4. Better electronic free energy calculators. 

However, in large part, what will be next will depend on what users express interest in and what the 
author can get funding for. . . 
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A Correlation to concentration conversions 

ATAT calculates the matrices C and X and vectors cq and xq in Equations (4) and (5) as follows. 

1. Start enumerating all structures (in order of increasing unit cell size) while computing the point cor- 
relation vector p for each of them. Let p denote the corresponding point correlation vector with a 
constant element appended to it. Skip any structures whose p is colinear with the p of earlier struc- 
tures. Terminate this step when the number of structures kept is equal to the dimension of p. 

2. For each of the kept structures, calculate its full concentration vector x. Create a matrix A by pasting 
the column vectors x just obtained side by side. Similar, create a matrix B by pasting the column 
vectors p previously obtained. Note that B is square and invertible by construction and calculate 
X = AB-'^. 

3. All columns of X but the last provide X while the last column of X gives xq. 

4. Eliminating all colinear rows from X gives C while eliminating the corresponding elements of xq gives 
Cq. (Note: in ATAT, cq is just set to zero since this makes no difference for convex hull construction. 
But xq is fully calculated.) 
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